1. Introduction



The fitness landscape was originally intended as a simple metaphor for an intuitive
understanding of adaptation (Wright, 1931). Adaptation can be pictured as an uphill
walk in the fitness landscape, where height represents fitness and where each step is
between similar genotypes. The concept of a fitness landscape has been formalized in
somewhat different ways (Beerenwinkel et al., 2007c), and the current theory is
extensive. Work in the field was primarily motivated by theoretical considerations, such
as the relation between global and local properties of fitness landscapes. However, it
may not be clear whether the classical models apply in a particular empirical context.
The underlying assumptions, such as a block structure of the fitness landscape, may or
may not hold.

Some recent approaches do not make any structural assumptions about the fitness
landscapes. We will consider the geometric theory of gene interactions and fitness
graphs. We define fitness as the logarithm of the expected reproductive success; there
are different definitions of fitness in the literature (Mani et al., 2008). Epistasis means
that fitness is not linear. For instance, the combination of two beneficial mutations may
result in a double mutant with much higher fitness than a linear expectation from the
fitness of the wild-type and the two single mutants. Such positive epistasis is
common for drug resistance mutations, for example antibiotic resistance mutations (e.g.
Goulart et al., 2012). It is not difficult to analyze the two-loci case, but it is less obvious
how to quantify, classify and interpret epistasis for several loci.

The most fine-scaled approach to gene interactions is the recently developed geometric
theory (Beerenwinkel et al., 2007b). The theory extends the usual concept of epistasis
for two mutations to any number of loci in the strict sense that all gene interactions are
reflected. The shapes, as defined in the geometric theory, play the role that positive and
negative epistasis play for two mutations.

In contrast to the sensitive shape analysis, a fitness graph is determined by the fitness
ranks of the genotypes only. Qualitative information, such as whether "good+good=better" or
"good+good=not good" holds for two single mutations, is reflected by the fitness graphs.
From the graphs one can immediately read off the coarse properties of the landscapes,
including the number of peaks. We argue that both the geometric theory and
fitness graphs are well suited for empirical work. Moreover, to some extent shapes and
fitness graphs provide complementary information. Shapes are relevant for recombination
and fitness graphs for mutational trajectories.

In many real populations at most two alternative alleles occur at each locus, or a
biallelic assumption is a reasonable simplification. Throughout the chapter, we will
consider biallelic L-loci populations. Let $\Sigma = \{0, 1\}$ and let $\Sigma^L$ denote the bit strings of
length L. $\Sigma^L$ represents the genotype space. In particular,

$\Sigma^2 = \{00, 01, 10, 11\}$ and $\Sigma^3 = \{000, 001, 010, 011, 100, 101, 110, 111\}$.

The zero-string denotes the string with zero in all L positions, and the 1-string denotes
the string with 1 in all L positions. We define a fitness landscape as a function
$w : \Sigma^L \to \mathbb{R}$, which assigns a fitness value to each genotype. The fitness of the
genotype g is denoted $w_g$. The metric we consider is the Hamming distance, meaning that
the distance between two genotypes equals the number of positions where the genotypes
differ. In particular, two genotypes are adjacent, or mutational neighbors, if they differ
at exactly one position. A walk in the fitness landscape corresponds to a Darwinian process in
a precise way. Consider a population after a recent change in the environment. Assume
that the wild-type no longer has optimal fitness. If we assume the strong-selection
weak-mutation (SSWM) regime, then a beneficial mutation goes to fixation
in the population before the next mutation occurs. The population is monomorphic for
most of the time, so that one genotype dominates the population at a particular point
in time. Consequently, we can think of a Darwinian process as an adaptive walk in the
fitness landscape, where each step represents a beneficial mutation going to fixation
in the population. The described model of adaptation has been widely used and relies
on approaches developed in Gillespie (1983, 1984) and Maynard Smith (1970).
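
As a minimal sketch of the definitions above, the genotype space, the Hamming distance and the mutational neighbors can be written in a few lines of Python. Genotypes are represented as strings of 0's and 1's; the function names are illustrative choices and not part of the original formalism.

```python
from itertools import product


def genotype_space(L):
    """All bit strings of length L, in lexicographic order."""
    return ["".join(bits) for bits in product("01", repeat=L)]


def hamming(g, h):
    """Number of positions where the genotypes g and h differ."""
    return sum(a != b for a, b in zip(g, h))


def neighbors(g):
    """Mutational neighbors of g: all strings at Hamming distance one."""
    flip = {"0": "1", "1": "0"}
    return [g[:i] + flip[g[i]] + g[i + 1:] for i in range(len(g))]


print(genotype_space(2))        # ['00', '01', '10', '11']
print(hamming("0110", "1100"))  # 2
print(neighbors("000"))         # ['100', '010', '001']
```

Each genotype of length L has exactly L mutational neighbors, one for each locus.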



The chapter is structured as follows. The topic of Sections 2-5 is fitness graphs, where
most results depend on Crona et al. (2013). The topic of Sections 6-10 is the geometric
theory of gene interactions, where most results depend on Beerenwinkel et al. (2007b)
and on triangulations of polytopes (De Loera et al., 2010). Section 11 compares fitness
graphs and shapes, as defined in the geometric theory. Section 12 is a discussion.

2. Fitness graphs and sign epistasis 

The concepts of the landscape metaphor can be made precise. An adaptive step in the 
fitness landscape corresponds to a change in exactly one position of a string so that the 
fitness increases strictly. An adaptive walk is a sequence of adaptive steps. A peak in the 
fitness landscape has the property that there are no adaptive steps away from it, i.e., a
genotype is at a peak if all its mutational neighbors have lower fitness than the
genotype itself. The following concepts are central as well; in particular, they are useful for
relating the number of peaks to local observations.
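
The notions of peak and adaptive walk can be illustrated with a short Python sketch. A landscape is stored as a dictionary from genotype strings to fitness values. The fitness values below are hypothetical, and the greedy rule (always move to the fittest neighbor) is only one of many possible adaptive walks, chosen here for simplicity.

```python
def neighbors(g):
    """All strings that differ from g in exactly one position."""
    flip = {"0": "1", "1": "0"}
    return [g[:i] + flip[g[i]] + g[i + 1:] for i in range(len(g))]


def is_peak(w, g):
    """True if every mutational neighbor of g has lower fitness than g."""
    return all(w[h] < w[g] for h in neighbors(g))


def peaks(w):
    return sorted(g for g in w if is_peak(w, g))


def greedy_adaptive_walk(w, start):
    """Adaptive walk that always moves to the fittest neighbor (an assumption)."""
    walk = [start]
    while True:
        g = walk[-1]
        best = max(neighbors(g), key=lambda h: w[h])
        if w[best] <= w[g]:
            return walk          # no adaptive step is available
        walk.append(best)


# Hypothetical two-locus landscape with two peaks.
w = {"00": 1.0, "10": 0.8, "01": 0.7, "11": 1.2}
print(peaks(w))                       # ['00', '11']
print(greedy_adaptive_walk(w, "10"))  # ['10', '11']
```

An adaptive walk always ends at a peak, but not necessarily at the global maximum: a walk started at 00 in this example cannot leave 00.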

For $L \geq 2$, given a string and two positions, exactly four strings can be obtained
which coincide with the original string except (at most) at the two positions. Denote
such a set of four strings

ab, Ab, aB, AB,

according to the two positions of interest, and assume that $w_{ab}$ is minimal. Sign epistasis
means that

$w_{AB} < w_{Ab}$ or $w_{AB} < w_{aB}$.

Reciprocal sign epistasis means that

$w_{AB} < w_{Ab}$ and $w_{AB} < w_{aB}$.

Fig. 1 shows the four possibilities under our assumption that $w_{ab}$ is minimal.

FIGURE 1. The arrows point toward the more fit genotype. The graphs
represent no sign epistasis, two cases with sign epistasis but not reciprocal
sign epistasis, and one case with reciprocal sign epistasis.



Sign epistasis is by no means rare for microbes according to several studies (e.g. Des [...]
antibiotic resistance mutations, as well as for HIV and malaria. In fact, existing studies
suggest that absence of sign epistasis is exceptional for systems associated with drug
resistance for L > 4.

Sign epistasis is of clinical importance for several reasons. A recent approach for
preventing and managing resistance problems takes advantage of both sign epistasis and
variable selective environments (Goulart et al., 2012). Another aspect of managing drug
resistance is to find constraints on the orders in which mutations accumulate from
genotype data (Desper et al., 1999; Beerenwinkel et al., 2007a). A constraint could be that
a particular mutation is selected for only if a different mutation has already occurred.
The existence of constraints implies sign epistasis. Indeed, if a particular mutation is
beneficial regardless of background, then it can occur before or after other mutations.
Moreover, sign epistasis is relevant for predictions of how populations will adapt
(Weinreich et al., 2006).

Fitness graphs are useful for the empirical problems mentioned, as well as for more 
theoretical problems, including the relation between global and local properties of fit- 
ness landscapes (see Section 3). A fitness graph compares the fitness ranks of mutational 
neighbors. For simplicity, whenever we use fitness graphs we assume that $w_s \neq w_{s'}$ for
any two strings s and s' which differ in one position only.

Roughly, consider the zero-string as the starting point (possibly the wild-type), and
each non-zero position of a string as an event, i.e., that a mutation has occurred. Under
these assumptions the fitness graph coincides with the Hasse diagram of the power set
of events, except that each edge in the Hasse diagram is replaced with an arrow toward
the string with greater fitness.

For a formal definition, a fitness graph is a directed graph where each node corresponds
to a string of $\Sigma^L$. The fitness graph has L + 1 levels. Each string s such that
$\sum_i s_i = l$ corresponds to a node on level l in the fitness graph. In particular, the node
representing the zero-string is at the bottom, the nodes representing strings with exactly
one non-zero position, including $10 \cdots 0$, are one level above, the nodes representing
strings with exactly two non-zero positions, including $110 \cdots 0$, are on the next level,
and the 1-string is at the top. Moreover, the nodes are ordered from left to right according
to the lexicographic order, where 1 > 0, of the corresponding strings (see e.g. Fig. 5).
A directed edge connects each pair of nodes such that the corresponding strings differ
in exactly one position. The edge is directed toward the node representing the more fit
of the two genotypes.

Remark 1. Unless otherwise stated, the words "level", "up", "down", "above" and "below"
refer to fitness graphs. In particular, notice that a higher level does not imply greater fitness.

For $L \geq 2$, given a string and two positions, consider the four strings which coincide
with the original string except in (at most) the two positions. We call the strings a type
2 system if there is reciprocal sign epistasis, a type 1 system if there is sign epistasis but
not reciprocal sign epistasis, and a type 0 system if there is no sign epistasis.
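
The classification into type 0, 1 and 2 systems can be sketched in Python as follows. The four strings are relabeled so that ab is the least fit one, and the type is the number of single mutants that are fitter than the double mutant AB. The function names and the fitness values are illustrative assumptions.

```python
FLIP = {"0": "1", "1": "0"}


def mutate(s, k):
    """Flip position k of the string s."""
    return s[:k] + FLIP[s[k]] + s[k + 1:]


def system_type(w, g, i, j):
    """Type (0, 1 or 2) of the system spanned by positions i and j around g."""
    four = [g, mutate(g, i), mutate(g, j), mutate(mutate(g, i), j)]
    ab = min(four, key=lambda s: w[s])      # relabel so that w_ab is minimal
    Ab, aB = mutate(ab, i), mutate(ab, j)
    AB = mutate(Ab, j)
    # Count how many of w_AB < w_Ab and w_AB < w_aB hold.
    return int(w[AB] < w[Ab]) + int(w[AB] < w[aB])


print(system_type({"00": 1.0, "10": 1.1, "01": 1.2, "11": 1.3}, "00", 0, 1))  # 0
print(system_type({"00": 1.0, "10": 1.3, "01": 1.1, "11": 1.2}, "00", 0, 1))  # 1
print(system_type({"00": 1.0, "10": 0.8, "01": 0.7, "11": 1.2}, "00", 0, 1))  # 2
```

The result does not depend on which of the four strings is passed as g, since the relabeling step always starts from the least fit string.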

For interpretations of general fitness graphs, it may be helpful to first analyze the
two-loci case in some detail. There exist exactly 14 fitness graphs for biallelic two-loci
systems (see Fig. 2), of which 4 are type 0 systems, 8 are type 1 systems, and 2 are type 2
systems. One verifies the following result.

Remark 2. For two loci, type 0, 1, and 2 systems have the following properties.

(1) A type 0 system can be rotated so that all arrows point up.

(2) A type 1 system differs from a cycle by exactly one arrow.

(3) A type 2 system has two nodes such that all edges are directed toward them, and two
nodes such that no edges are directed toward them.
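
The count of 14 two-loci fitness graphs, and the split into 4, 8 and 2 graphs of type 0, 1 and 2, can be checked by brute force. The sketch below runs over all 24 rank orders of the four genotypes, collects the distinct fitness graphs, and classifies each graph by its sinks and sources along the lines of Remark 2. The representation of a graph as a tuple of (tail, head) pairs is an illustrative choice.

```python
from itertools import permutations

NODES = ["00", "10", "01", "11"]
EDGES = [("00", "10"), ("00", "01"), ("10", "11"), ("01", "11")]


def fitness_graph(w):
    """Orient each edge toward the fitter genotype, as (tail, head) pairs."""
    return tuple((s, t) if w[s] < w[t] else (t, s) for s, t in EDGES)


def graph_type(graph):
    heads = [head for _, head in graph]
    tails = [tail for tail, _ in graph]
    sinks = [n for n in NODES if heads.count(n) == 2]
    if len(sinks) == 2:
        return 2                      # two nodes with all edges toward them
    source = next(n for n in NODES if tails.count(n) == 2)
    opposite = all(a != b for a, b in zip(source, sinks[0]))
    return 0 if opposite else 1       # type 0: source and sink are opposite


graphs = {fitness_graph(dict(zip(NODES, ranks)))
          for ranks in permutations(range(4))}
counts = {t: sum(graph_type(g) == t for g in graphs) for t in (0, 1, 2)}
print(len(graphs), counts)  # 14 {0: 4, 1: 8, 2: 2}
```

The two cyclic orientations of the square never appear, since fitness cannot increase along a cycle; this accounts for 14 rather than 16 graphs.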

The observations from the two-loci case should make it easy to identify type 0, 1 and
2 systems for general fitness graphs. Fig. 3 and 4 show fitness graphs for 3-loci systems.
Fig. 3a has type 0 systems only, Fig. 3b type 0 and 2 systems, Fig. 4a type 0, 1 and
2 systems, and Fig. 4b type 2 systems only. Fig. 5 shows a fitness graph for a 4-loci
population, where there are several type 2 systems, including 0001, 0101, 0011, 0111.



3. Fitness graphs and theoretical results 



Fitness graphs have mostly been used in empirical work (e.g. de Visser et al., 2009; Franke
et al., 2011; Szendro et al., 2012; Goulart et al., 2012). However, we will indicate how
they can be used in theoretical arguments, and mention some results where the proofs
depend on fitness graphs.

FIGURE 2. For a fitness graph, the arrows point toward the genotype of
greater fitness. There exist exactly 14 fitness graphs for biallelic two-loci
systems, where the type 0 systems are on the first row, the type 1 systems
on the second row, and the type 2 systems on the third row.

FIGURE 3. A fitness graph shows sign epistasis and the peaks. The graph
in Fig. 3a has type 0 systems only. The graph in Fig. 3b has type 0 and
type 2 systems, but no type 1 systems.



It is known that one can have $2^{L-1}$ peaks in a fitness landscape (e.g. Haldane, 1931),
and this number is an upper bound. The proof is elementary, and we will not give the
details. However, we will construct fitness landscapes with the maximal number of
peaks using fitness graphs.

FIGURE 4. The graph in Fig. 4a has type 0, type 1 and type 2 systems. The
graph in Fig. 4b has type 2 systems only, and the corresponding fitness
landscape has four peaks.

FIGURE 5. The fitness landscape has peaks at 1100, 0011 and 1111,
whereas all triple mutants (mutants on the third level) have low fitness.
Example 1. For any L, consider the fitness graph where the edges are directed up from level
0 to 1, down from level 1 to 2, up from level 2 to 3, and so on. The fitness graph in Fig. 4b is an
example. Notice that the graph corresponds to fitness landscapes with 4 peaks, i.e., the maximal
number of peaks for L = 3. In general, all nodes at levels 1, 3, 5, ... are at peaks, and such fitness
graphs correspond to fitness landscapes with exactly $2^{L-1}$ peaks.
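
The construction in Example 1 can be checked numerically. In the sketch below, a genotype gets fitness 1 if it lies on an odd level and fitness 0 otherwise; these particular values are an assumption made for illustration, and any values that respect the same arrows would do.

```python
from itertools import product


def neighbors(g):
    flip = {"0": "1", "1": "0"}
    return [g[:i] + flip[g[i]] + g[i + 1:] for i in range(len(g))]


def alternating_landscape(L):
    """Fitness 1 on odd levels and 0 on even levels (illustrative values)."""
    return {"".join(b): b.count("1") % 2 for b in product("01", repeat=L)}


def count_peaks(w):
    return sum(all(w[h] < w[g] for h in neighbors(g)) for g in w)


for L in range(1, 7):
    # The two printed numbers agree for every L in this range.
    print(L, count_peaks(alternating_landscape(L)), 2 ** (L - 1))
```

Every neighbor of a string on an odd level lies on an even level, so all arrows at such a node point toward it, which is why exactly the $2^{L-1}$ strings on odd levels are peaks.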

Recent work relates global and local properties of fitness landscapes (Poelwijk et al.,
2007, 2011; Crona et al., 2013). This topic is of interest, since most empirical studies of
fitness landscapes concern local properties, including sign epistasis. It has been shown
that multipeaked fitness landscapes have type 2 systems (Poelwijk et al., 2011). The
converse is not true. However, a sufficient condition for multiple peaks can be phrased
in terms of type 1 and 2 systems. More precisely, the following result was proved using
fitness graphs.



Theorem 1. (Crona et al., 2013) If a fitness landscape has type 2 systems and no type 1 systems, then it
has multiple peaks.



It follows that the landscapes corresponding to Fig. 3b and 4b have multiple peaks. 
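
The statement can be explored computationally for small L. The following sketch counts the type 0, 1 and 2 systems of a landscape, applies the count to a landscape of the kind shown in Fig. 4b (with hypothetical fitness values 0 and 1 on alternating levels), and then runs through all 8! fitness rank orders for L = 3 to confirm that type 2 systems without type 1 systems always come with more than one peak. This is an illustration for L = 3 only, not a substitute for the proof; the exhaustive loop may take a few seconds.

```python
from itertools import combinations, permutations, product

FLIP = {"0": "1", "1": "0"}


def mutate(s, k):
    return s[:k] + FLIP[s[k]] + s[k + 1:]


def system_type(w, g, i, j):
    four = [g, mutate(g, i), mutate(g, j), mutate(mutate(g, i), j)]
    ab = min(four, key=lambda s: w[s])       # relabel so that w_ab is minimal
    Ab, aB = mutate(ab, i), mutate(ab, j)
    AB = mutate(Ab, j)
    return int(w[AB] < w[Ab]) + int(w[AB] < w[aB])


def type_counts(w, L):
    """Number of type 0, 1 and 2 systems over all pairs of positions."""
    counts = {0: 0, 1: 0, 2: 0}
    for i, j in combinations(range(L), 2):
        for g in w:
            if g[i] == "0" and g[j] == "0":  # visit each system once
                counts[system_type(w, g, i, j)] += 1
    return counts


def count_peaks(w):
    return sum(all(w[mutate(g, k)] < w[g] for k in range(len(g))) for g in w)


genotypes = ["".join(b) for b in product("01", repeat=3)]

# Landscape with alternating levels, as in Fig. 4b (values are hypothetical).
alternating = {g: g.count("1") % 2 for g in genotypes}
print(type_counts(alternating, 3), count_peaks(alternating))  # {0: 0, 1: 0, 2: 6} 4

# Exhaustive check over all fitness rank orders for L = 3.
checked = 0
for ranks in permutations(range(8)):
    ranked = dict(zip(genotypes, ranks))
    c = type_counts(ranked, 3)
    if c[2] > 0 and c[1] == 0:
        checked += 1
        assert count_peaks(ranked) > 1
print("rank orders with type 2 but no type 1 systems:", checked)
```

Since a fitness graph depends on the fitness ranks only, running through all rank orders covers every fitness graph for three loci.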

Fitness graphs are efficient for analyzing mutational trajectories. We will state a result
regarding accessible mutational trajectories from Weinreich et al. (2005). A brief proof
of the result using fitness graphs was given in Crona et al. (2013), but the original proof
does not use fitness graphs.

We refer to the global maximum of the landscape as "the fitness peak". Moreover,
define a general step similarly to an "adaptive step", except that the fitness may decrease. A
general walk, as opposed to an "adaptive walk", is a sequence of general steps. If a general
walk between two nodes has minimal length, we call it a shortest walk.



Theorem 2. (Weinreich et al., 2005)

(1) The following conditions are equivalent for a fitness landscape.

(i) Each general step toward the fitness peak, i.e., a step that decreases the
graph-theoretical distance to the peak, is an adaptive step.

(ii) Each shortest general walk to the fitness peak is an adaptive walk.

(iii) The fitness landscape has no type 1 or 2 systems.

(2) If the equivalent conditions in (1) are satisfied, then each adaptive walk to the fitness peak
is a shortest general walk.



A fitness landscape satisfying the equivalent conditions (i)-(iii) above is referred to
as a fitness landscape lacking genetic constraints on accessible mutational trajectories in
Weinreich et al. (2005). For L = 3, the fitness graph in Fig. 3a corresponds to this category
of landscapes. Fitness landscapes lacking genetic constraints on accessible mutational
trajectories can be represented by fitness graphs where all arrows are up. For brevity,
we will refer to "all arrows up landscapes".

It is important to notice that the concept of an all arrows up landscape is biologically
meaningful. Even if a landscape is single peaked, type 1 systems may cause the adaptation
process to be slower, since not all shortest general walks to the peak are adaptive
walks. However, for all arrows up landscapes, there are no local obstacles for the
adaptation process.

4. Fitness graphs and recombination

Recombination can generate new genotypes in a population. Under some circumstances,
recombination will speed up adaptation. An early hypothesis about the possible
advantage of recombination concerned double mutants of high fitness, where the
corresponding single mutations are deleterious. It was suggested that recombination
could generate such double mutants. In terms of fitness graphs this case can be
described as a type 2 system, where the wild-type is at a fitness peak. However, the
hypothesis was immediately criticized, and described as a "widespread fallacy" by Muller
(Crow, 2006). The two single mutations being deleterious, it seems unlikely that the
corresponding genotypes would appear and recombine to the double mutant. The (current)
consensus is that under most circumstances recombination will not be of any use
in the situation described, i.e., for a two-loci type 2 system where the wild-type is at a
peak (see also Lenski et al., 2003). However, using fitness graphs we will argue that
recombination could be an advantage in somewhat related cases where $L \geq 3$.

The topic of recombination involves subtle differences between effects on the
population level and the gene level. For instance, it is theoretically possible that
recombination is beneficial for a population and at the same time recombination suppressors
could be selected for (see e.g. Otto and Lenormand (2002) for comments and references).
We do not intend to develop new theory, or describe existing knowledge of
recombination in any detail. For an overview of the field, we refer to Otto and
Lenormand (2002). Our goal is to point out mechanisms specific for $L \geq 3$ loci which should
be considered in an analysis of the effect of recombination. This is justified since the
field is dominated by work on the two-loci case, or mechanisms which can be reduced
to the two-loci case.

It has been suggested that recombination has an especially strong impact in structured 
populations, see e.g. Martin et al. (2006). More precisely, for a population subdivided
into local subpopulations, with some migration between them, recombination could be
advantageous. We will sketch a model within this framework, which we call a puddles 
and flood population. We mainly have microbes in mind, for example bacteria. Assume 
that the local subpopulations live in puddles, and the subpopulations are homogeneous 
for most of the time. Occasionally, there is a flood where the contents of the local pud- 
dles get thoroughly mixed. After a flood, life proceeds as usual in the puddles for an 
extended period, until the next flood. Under these assumptions, genotypically different 
subpopulations are likely to mix, so that recombination can generate new genotypes. 



Example 2. Consider the fitness graph in Fig. 5, and assume that 0000 is the wild-type. Both
1100 and 0011 are at peaks, whereas the triple mutants are less fit than the adjacent double
mutants. For a puddles and flood population, recombination of double mutants may result in
1111. In this case the advantage of recombination could be substantial. Notice that in the absence
of recombination, one could obtain 1111 from 1100 only if there is a double mutation, since the
triple mutants are not fit.

Example 3. Consider the fitness graph in Fig. 4b. Assume that 000 is the wild-type. From
the fitness graph, the single mutants 100, 010 and 001 are at peaks. Under the assumption that
111 has maximal fitness, recombination could speed up adaptation. However, notice that two
recombination events are necessary. For instance, recombination of 100 and 010 could result in
110 (with relatively low fitness). Then the recombination of 110 and 001 could result in 111.

Notice that there is an important difference between Examples 2 and 3. For instance,
consider the outcome for a puddles and flood population where no more than two puddles
mix at a time. Then one could obtain 1111 by recombination in Example 2, but
one would probably not obtain 111 in Example 3.
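
The difference between the two examples can be made concrete by listing the genotypes that a single recombination event between two parents can produce. The sketch assumes free recombination, so that each locus is inherited from either parent independently; the function name is illustrative.

```python
from itertools import product


def recombinants(g, h):
    """All genotypes that take each locus from one of the parents g and h
    (free recombination between loci is assumed)."""
    return sorted({"".join(c) for c in product(*zip(g, h))})


print("1111" in recombinants("1100", "0011"))  # True  (Example 2)
print(recombinants("100", "010"))              # ['000', '010', '100', '110']
print("111" in recombinants("100", "010"))     # False (Example 3, first event)
print("111" in recombinants("110", "001"))     # True  (Example 3, second event)
```

In Example 2 the 1-string is reachable from the two peaks in one recombination event, whereas in Example 3 the intermediate genotype 110 has to be formed first.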

Consider all arrows up fitness landscapes where the 1-string has maximal fitness. 
Then one could obtain the 1-string from a sequence of single mutations. However, for 
a puddles and flood population, recombination could speed up adaptation. This is 
because the process of accumulating L single mutations could be time consuming. 

Example 4. For an all arrows up L-loci fitness landscape where clearly more than L puddles 
tend to mix during a flood period, one could obtain the 1-string already after one flood period. 

The examples described are theoretical constructions. It is not obvious whether Examples
2 and 3, or similar cases, occur frequently enough in nature to have much of an
impact. A first question to ask for a population is how frequently it happens that
"good+good=not good" for single mutations. This type of problem is the topic of
the next section.

5. Fitness graphs and other qualitative measures 

In order to determine if one has a reasonable chance to find fitness graphs of the types 
described in the previous sections, the following qualitative concept (Crona et al., 2013)
may be useful. 

We define $B$ and $B_p$ as follows. The set $B_p$ consists of all double mutants such that both
corresponding single mutations are beneficial.

The set $B \subseteq B_p$ consists of all double mutants in $B_p$ which are more fit than at least one of the
corresponding single mutants.

The qualitative measure of the degree of additivity for a fitness landscape is the ratio

$|B| / |B_p|$.

Notice that $|B| / |B_p| = 1$ for additive fitness landscapes. One may consider all arrows up
landscapes as the qualitative correspondence to additive fitness landscapes. For such
landscapes $|B| / |B_p| = 1$ as well.
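
A minimal sketch of how the ratio could be computed is given below. It assumes that single and double mutants are taken relative to the zero-string as the wild-type, which is our reading of the definition and not stated explicitly above; the effect sizes and the modified fitness value are hypothetical.

```python
from itertools import combinations, product


def additivity_ratio(w, L):
    """|B| / |B_p|, with mutants taken relative to the zero-string (assumption)."""
    wild = "0" * L

    def mutant(loci):
        return "".join("1" if k in loci else "0" for k in range(L))

    n_Bp = n_B = 0
    for i, j in combinations(range(L), 2):
        single_i, single_j, double = mutant({i}), mutant({j}), mutant({i, j})
        if w[single_i] > w[wild] and w[single_j] > w[wild]:
            n_Bp += 1                  # both single mutations are beneficial
            if w[double] > min(w[single_i], w[single_j]):
                n_B += 1               # fitter than at least one single mutant
    return n_B / n_Bp if n_Bp else None


effects = (1, 2, 3)  # hypothetical additive effects on fitness
genotypes = ["".join(b) for b in product("01", repeat=3)]
additive = {g: sum(e for e, bit in zip(effects, g) if bit == "1") for g in genotypes}
print(additivity_ratio(additive, 3))   # 1.0

rugged = dict(additive)
rugged["110"] = 0.5                    # "good+good=not good" for the first two loci
print(additivity_ratio(rugged, 3))     # two of the three double mutants remain in B
```

The function returns None when no pair of single mutations is beneficial, since the ratio is then undefined.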

Antibiotic resistance landscapes for a particular 4-loci system and 15 selective
environments were studied in Goulart et al. (2012). More precisely, the TEM-1 mutations
L21F, R164S, T265M and E240K were considered. The mean value of $|B| / |B_p|$ for the 15
fitness landscapes was 0.61.

In contexts where the relative fitness values of genotypes are not known, qualitative
concepts can still be used. Fitness ranks tend to be easier to determine than
relative fitness, and from records of mutations one can sometimes draw conclusions
about fitness ranks without making measurements. We argue that one can learn a lot about
fitness landscapes from existing records of mutations, in particular from drug resistance
mutations. However, one needs to be able to interpret qualitative information, a theme
developed in Crona et al. (2012) with applications to antibiotic resistance; see also Crona
et al. (2013).



For a quantitative measure of the degree of additivity, we refer to the concept of
"roughness" (Carneiro and Hartl, 2010; Aita et al., 2001).



6. Shapes 



As before, we consider biallelic L-loci populations throughout the section. For
$\Sigma = \{0, 1\}$ the genotype space of the population is $\Sigma^L$. This means that we assume that all
$2^L$ genotypes occur in the populations we consider. For a comment regarding this
simplification, see Section 10. Most empirical studies of epistasis for several loci focus on
pairwise gene interactions using ANOVA methods (Beerenwinkel et al., 2007b), or on the
average curvature (Fig. 6). For beneficial mutations, antagonistic epistasis means that
the combined effect of mutations is less than the sum of the individual effects, whereas
synergistic epistasis means that the combined effect of mutations exceeds the sum of the
individual effects. It has been claimed that antagonistic epistasis dominates for beneficial
mutations in nature (e.g. Kryazhimskiy et al., 2011). Such epistasis is also called
negative epistasis. Antagonistic and synergistic epistasis are defined analogously
for deleterious mutations (Fig. 6). A motivation for the interest in average effects of
mutations is the connection to recombination. According to standard models, negative
epistasis sometimes implies an advantage for recombination (Otto and Lenormand,
2002).



Conventional summary statistics for epistasis have their limitations. The average
curvature may obscure a diversity of interaction types, and pairwise tests fail to discover
curvature at genetic distances greater than two. The most fine-scaled approach to gene
interactions is the geometric theory, introduced in Beerenwinkel et al. (2007b). The
theory reveals all the gene interactions, and it depends on triangulations of polytopes. For
mathematical background we refer to De Loera et al. (2010); see also Ziegler (1995) for
general theory about polytopes. The geometric approach has revealed previously
unappreciated gene interactions for HIV, Escherichia coli and in some other cases
(Beerenwinkel et al., 2007b,c), and the approach is relevant for recombination.



We will start with an informal introduction to the geometric theory, where the main 
purpose is to provide an intuitive understanding and some geometric interpretations. 
More formal descriptions are given in the next sections. 

Roughly, a triangulation of a polygon is a subdivision of the polygon into triangles. 
A triangulation of the L-cube is a subdivision of the cube into simplices (triangles if 
L = 2, tetrahedra if L = 3, pentachora for L = 4, and so on). We will use some concepts 
which are defined in terms of populations. If one groups individuals into classes of 
identical genotypes, a population can be described as the frequencies of the genotypes. 
The fitness of a population is defined as the average fitness of all individuals.

FIGURE 6. The number of mutations increases along the horizontal axis,
and the fitness increases along the vertical axis. For Fig. 6a, the mutations
are beneficial. The upper curve corresponds to antagonistic epistasis, and
the lower curve to synergistic epistasis. For Fig. 6b, the mutations are
deleterious. The upper curve corresponds to synergistic epistasis and the
lower curve to antagonistic epistasis.

First consider the case L = 2. Then the genotope is the square with vertices 00, 01, 10, 11.
We denote this genotope $[0,1]^2$, and interpret a point $v = (v_1, v_2) \in [0,1]^2$ as the allele
frequencies of the population, where $v_1$ denotes the frequency of 1's at the first locus,
and $v_2$ the frequency of 1's at the second locus. Let

$\Delta = \{(p_{00}, p_{01}, p_{10}, p_{11}) \in [0,1]^4 : p_{00} + p_{01} + p_{10} + p_{11} = 1\}$

denote the population simplex. A population is given as a point in $\Delta$.

Example 5. Consider $v = (0.4, 0.8) \in [0,1]^2$ and the populations $p^1 = (0.2, 0.4, 0, 0.4) \in \Delta$
and $p^2 = (0, 0.6, 0.2, 0.2) \in \Delta$. One verifies that both populations have the allele frequencies
described by v; indeed, adding the contributions of 1's for $p^1$ and the first locus gives 0 + 0.4 =
0.4, and for the second locus 0.4 + 0.4 = 0.8. The contributions for $p^2$ give 0.2 + 0.2 = 0.4 for
the first locus, and 0.6 + 0.2 = 0.8 for the second.

Let $\rho$ denote the corresponding map from the population simplex $\Delta$ to the genotope
$[0,1]^2$,

$\rho(p_{00}, p_{01}, p_{10}, p_{11}) = (p_{10} + p_{11}, p_{01} + p_{11})$.

Then $\rho$ maps a point of the population simplex to the allele frequencies, where $p_{10} + p_{11}$
equals the frequency of 1's at the first locus, and $p_{01} + p_{11}$ equals the frequency of 1's at
the second locus. Notice that $\rho(p^1) = \rho(p^2) = v$ in the previous example.
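
The map from populations to allele frequencies is easy to check numerically. The sketch below uses the two populations of Example 5, with populations stored as tuples in the order (p00, p01, p10, p11); the function name rho is chosen to match the notation of the text.

```python
from math import isclose


def rho(p):
    """Map a population (p00, p01, p10, p11) to its allele frequencies (v1, v2)."""
    p00, p01, p10, p11 = p
    return (p10 + p11, p01 + p11)


p1 = (0.2, 0.4, 0.0, 0.4)
p2 = (0.0, 0.6, 0.2, 0.2)
v = (0.4, 0.8)

for p in (p1, p2):
    assert isclose(sum(p), 1.0)                               # a point of the simplex
    assert all(isclose(x, y) for x, y in zip(rho(p), v))      # same allele frequencies
print(rho(p1), rho(p2))
```

The comparison uses isclose rather than equality because the frequencies are floating point numbers. Two different populations with the same allele frequencies differ only in how the alleles are combined into genotypes.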

Given a fitness landscape and a vector $v \in [0,1]^2$, the fittest population $p \in \Delta$ has
maximal fitness among populations such that the allele frequencies are described by v, and
p is unique. For a fittest population, one cannot increase the fitness by shuffling around
alleles. The biological significance is immediate, since such allele shuffling relates to
recombination.

We will give a description of the triangulation induced by w for L = 2. This triangulation
is the shape of the fitness landscape. The critical property of the triangulation is
that for $v \in [0,1]^2$, the genotypes that occur in the fittest population are the vertices of
the triangle which contains v. The corresponding result holds for any L. We will first
describe the triangulations, and then give a geometric interpretation of shapes. Notice
that fitness is additive exactly if

$w_{11} - w_{10} - w_{01} + w_{00} = 0$.

Case 1: (positive epistasis) If

$w_{11} - w_{10} - w_{01} + w_{00} > 0$,

then the triangulation induced by the fitness landscape has the 00-11 diagonal, meaning
that the triangles are {00, 01, 11} and {00, 10, 11} (Fig. 7).

Case 2: (negative epistasis) If

$w_{11} - w_{10} - w_{01} + w_{00} < 0$,

then the induced triangulation of the genotope has the 10-01 diagonal, meaning that the
triangles are {00, 01, 10} and {01, 10, 11} (Fig. 7).

For a geometric interpretation, consider the genotope $[0,1]^2$ and the four points above
the vertices of $[0,1]^2$, such that the height coordinates correspond to fitness. The four
points are vertices of a tetrahedron (Fig. 7). The upper sides of the tetrahedron (marked
with different patterns) project onto two triangles of $[0,1]^2$. The projections describe the
triangulation induced by w. The left picture corresponds to positive epistasis, and the
right to negative epistasis. This construction should make sense, since the triangulation
obtained as projections of the upper faces of the tetrahedron has the critical property for
all fittest populations. More precisely, for any $v \in [0,1]^2$, the fittest population consists
of vertices of the triangle which contains v. The fitness landscape almost always induces
a triangulation of the genotope $[0,1]^2$ as described. Such a triangulation is a generic
shape. The exceptional (non-generic) case is when fitness is additive, so that

$w_{11} - w_{10} - w_{01} + w_{00} = 0$.

For positive epistasis (Case 1), notice that $p^1$ in Example 5 is a fittest population. In the
case with positive epistasis for L = 2, the genotypes 10 and 01 are not on the same
triangle. Recombination of 10 and 01 genotypes resulting in 00 and 11 genotypes implies
increased average fitness of the population.
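
For two loci the fittest population can be written down explicitly, which gives a quick way to check the two cases. With allele frequencies (v1, v2) fixed, the populations form a segment parametrized by t = p11, namely p10 = v1 - t, p01 = v2 - t and p00 = 1 - v1 - v2 + t, and the population fitness changes linearly in t with slope w11 - w10 - w01 + w00. The sketch below uses this observation; the fitness values are hypothetical and the function names are illustrative.

```python
def epistasis(w):
    return w["11"] - w["10"] - w["01"] + w["00"]


def shape(w):
    """Triangles of the induced triangulation of the square, or None if additive."""
    u = epistasis(w)
    if u > 0:
        return [{"00", "01", "11"}, {"00", "10", "11"}]   # 00-11 diagonal
    if u < 0:
        return [{"00", "01", "10"}, {"01", "10", "11"}]   # 10-01 diagonal
    return None


def fittest_population(w, v):
    """Fittest (p00, p01, p10, p11) with allele frequencies v = (v1, v2)."""
    v1, v2 = v
    lo, hi = max(0.0, v1 + v2 - 1.0), min(v1, v2)         # admissible range of p11
    u = epistasis(w)
    if u == 0:
        raise ValueError("additive fitness: the fittest population is not unique")
    t = hi if u > 0 else lo
    return (1.0 - v1 - v2 + t, v2 - t, v1 - t, t)


w_pos = {"00": 1.0, "01": 1.1, "10": 1.1, "11": 1.5}      # positive epistasis
w_neg = {"00": 1.0, "01": 1.3, "10": 1.3, "11": 1.4}      # negative epistasis
print(fittest_population(w_pos, (0.4, 0.8)))  # approximately (0.2, 0.4, 0.0, 0.4)
print(fittest_population(w_neg, (0.4, 0.8)))  # approximately (0.0, 0.6, 0.2, 0.2)
```

Up to rounding, the first result is the population $p^1$ of Example 5, supported on the triangle {00, 01, 11}, and the second is $p^2$, supported on {01, 10, 11}.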

FIGURE 7. The upper pictures show the triangulations of the genotopes
(the squares) in Case 1 (positive epistasis) and 2 (negative epistasis). The
lower left picture shows the tetrahedron above the genotope in Case 1,
where the height coordinates correspond to the fitness of the four genotypes
under consideration. The projections of the upper sides of the tetrahedron
describe the triangulations. The lower right picture shows how
the triangulation is induced in Case 2.

In general, consider a biallelic L-loci system. The genotope is the L-cube $[0,1]^L$, where
the vertices represent the genotypes. As in the two-loci case, let $\Delta$ denote the population
simplex and let $\rho$ denote the corresponding map from $\Delta$ to $[0,1]^L$. For a fixed $v \in [0,1]^L$,
consider the linear programming problem

$\max \{ p \cdot w : \rho(p) = v \}$.

A solution gives the maximal population fitness, i.e., the maximum of $p \cdot w$, given the
allele frequency vector v (since $\rho(p) = v$). Consequently, finding the fittest population
translates to solving this linear programming problem.

If we let v vary, we get the following parametric linear programming problem:

$w(v) = \max \{ p \cdot w : \rho(p) = v \}$ for all $v \in [0,1]^L$.
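
For small L the linear programming problem can be solved by brute force, without a dedicated solver. The sketch below relies on the standard fact that a linear program over a bounded polytope attains its maximum at a basic feasible solution, here a population supported on at most L + 1 genotypes. It enumerates all candidate supports, solves the resulting linear system exactly with fractions, and keeps the feasible solution of highest fitness. This is an illustration only; the approach and the helper names are our own choices, and the fitness values are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product


def solve(A, b):
    """Solve the square system A x = b exactly; return None if A is singular."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


def fittest_population(w, v):
    """Maximize p . w subject to rho(p) = v by enumerating supports of size L + 1."""
    L = len(v)
    genotypes = ["".join(b) for b in product("01", repeat=L)]
    best = None
    for support in combinations(genotypes, L + 1):
        # One equation per locus, plus the frequencies summing to one.
        A = [[int(g[i]) for g in support] for i in range(L)] + [[1] * (L + 1)]
        p = solve(A, list(v) + [1])
        if p is None or min(p) < 0:
            continue                          # not a basic feasible solution
        fitness = sum(x * w[g] for x, g in zip(p, support))
        if best is None or fitness > best[0]:
            best = (fitness, {g: x for g, x in zip(support, p) if x > 0})
    return best


w = {"00": 0, "01": 1, "10": 1, "11": 3}      # hypothetical, positive epistasis
v = (Fraction(2, 5), Fraction(4, 5))
fitness, population = fittest_population(w, v)
print(fitness, population)   # fitness 8/5, supported on 00, 01 and 11
```

The support of the returned population, here {00, 01, 11}, is the triangle of the induced triangulation that contains v. The number of candidate supports grows quickly with L, so for larger systems a linear programming solver would be the natural choice.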

FIGURE 8. A non-regular triangulation. This triangulation cannot be induced
from a fitness landscape.

The domains of linearity of w almost always constitute a triangulation of the genotope
(De Loera et al., 2010, Chapter 2). The shape of the fitness landscape is the
triangulation of $[0,1]^L$ induced by the fitness landscape w (see also the next section). The
geometric interpretation is analogous to the two-loci case, so that the triangulation is
obtained as the projections of the upper faces of the polytope constructed from the fitness
landscape.

For the two-loci case, the geometric theory does not contribute anything new, since 
there exist only two triangulations corresponding to the two types of epistasis in the 
usual sense. However, for L = 3, there are 74 generic shapes corresponding to triangu- 
lations of the cube (see Section 9). 

Not all triangulations can be obtained from a fitness landscape. A triangulation is reg- 
ular if it is induced by some fitness landscape. Fig. 8 shows a non-regular triangulation. 
This is the smallest non-regular triangulation. In the literature, a regular triangulation 
is described as a triangulation which is induced by a cost vector. Then the linear pro- 
gramming problem concerns minimizing the cost, and the triangulations are obtained 
as projections of all lower faces of the polytope constructed from the cost vector. Since 
our topic is fitness landscapes, we think in terms of maximal fitness rather than minimal 
costs. 



7. Shapes and flips 

For L > 2, there are many possible shapes. It may seem that shapes are difficult to 
apply in empirical biology, due to ambiguity from measurement errors. However, the 
geometric theory comes with a structure. Shapes may be similar or completely different, 
and the relation between shapes can be described in a systematic way. We start with 
intuitive descriptions. Briefly, a flip, sometimes referred to as a geometric bistellar flip, 
is a minimal change between triangulations. Figs. 9, 10 and 11 show flips. For the 
two-loci case, the two triangulations corresponding to positive and negative epistasis differ 
by a flip. 

FIGURE 9. The left triangulation can be transformed into the right triangulation 
by a flip. 

For an overview of how all triangulations of a polytope are related, one can consider 
the flip graph. The nodes of the graph are the triangulations, and edges connect 
triangulations which differ by a flip. Fig. 12 shows the flip graph of a hexagon. The 
graph-theoretical distance between triangulations can be considered a measure of how closely 
related the triangulations are. Some caution is necessary if one is primarily interested 
in regular triangulations, since a regular triangulation may be transformed into a 
non-regular triangulation by a flip. 
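As an illustration of the flip graph, the following Python sketch enumerates the triangulations of a convex polygon and connects two of them when they differ by a flip. It is a sketch under simplifying assumptions: the polygon is convex with vertices labelled 0, ..., n-1, a triangulation is stored as a set of triangles, and two triangulations are taken to differ by a flip when they share all but two triangles. The function names are hypothetical.

```python
from functools import lru_cache
from itertools import combinations


@lru_cache(maxsize=None)
def triangulations(i, j):
    """All triangulations of the convex polygon on vertices i, i+1, ..., j."""
    if j - i < 2:
        return (frozenset(),)  # a single edge needs no triangles
    result = []
    for k in range(i + 1, j):  # apex of the unique triangle on edge (i, j)
        for left in triangulations(i, k):
            for right in triangulations(k, j):
                result.append(left | right | {(i, k, j)})
    return tuple(result)


def flip_graph(n):
    """Nodes and edges of the flip graph of a convex n-gon (n >= 4)."""
    nodes = triangulations(0, n - 1)
    # Each triangulation has n - 2 triangles; a flip replaces exactly two.
    edges = [(s, t) for s, t in combinations(nodes, 2) if len(s & t) == n - 4]
    return nodes, edges


hexagon_nodes, hexagon_edges = flip_graph(6)
```

For the hexagon this yields 14 triangulations and 21 flips, and every triangulation admits exactly three flips, one for each of its diagonals.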

8. Shapes and polyhedral subdivisions 

Given a genotype space $\Sigma^L$, consider all possible shapes induced by fitness 
landscapes $w : \Sigma^L \to \mathbb{R}$. We will describe how the shapes are related. Most 
results depend on triangulations of polytopes. In particular, we will discuss the secondary 
polytope (Gelfand et al. 1994), an important construction in discrete mathematics. The 
secondary polytope is useful for a global understanding of shapes. We will not provide proofs, 
but rather describe results and how they apply to epistasis. For a thorough treatment, 
including proofs, see De Loera et al. (2010), and for the biological perspective 
Beerenwinkel et al. (2007b). Some definitions below may seem technical, but the figures and 
intuitive descriptions from the previous sections should help. 

Throughout the section, let $A \subset \mathbb{R}^d$ denote a finite point set. A polytope is 
the convex hull of a point set. Polytopes include points, line segments, triangles and 
tetrahedra, as well as L-cubes and polygons. We will use some concepts expressed in terms of 
point sets, although we have polytopes in mind. In particular, a triangulation of a polytope is 
a triangulation of the set of its vertices, and similarly for the other concepts. 

FIGURE 10. The dashed lines indicate the triangulations. The triangulations 
differ by a flip. 

FIGURE 11. The triangulations differ by a flip, and the number of triangles 
is different for the two triangulations. 

A k-simplex is the convex hull of k + 1 affinely independent points. In particular, 
points, segments, triangles and tetrahedra are simplices. A j-face of a k-simplex is the 
convex hull of a subset of j + 1 vertices. 

We will give a formal definition of triangulations and some related concepts. A 
polyhedral subdivision of a point set $A$ is a collection of polytopes $\mathcal{C}$, such that 

(i) if $C \in \mathcal{C}$ then each face of $C$ belongs to $\mathcal{C}$ as well (closure property), 

(ii) the union $\bigcup_{C \in \mathcal{C}} C = \mathrm{conv}(A)$ (union property), 

(iii) for $C \neq C'$ where $C, C' \in \mathcal{C}$, the intersection $C \cap C'$ does not contain 
any interior points of $C$ or $C'$ (intersection property). 

FIGURE 12. The flip graph of a hexagon. 

A triangulation of $A$ is a polyhedral subdivision such that all polytopes are simplices. A 
refinement $\mathcal{C}'$ of a polyhedral subdivision $\mathcal{C}$ is a polyhedral subdivision 
where for each $C' \in \mathcal{C}'$ there exists a $C \in \mathcal{C}$ such that 
$C' \subseteq C$. A polyhedral subdivision is an almost 
triangulation if it is not a triangulation, but all its proper refinements are triangulations. 
Two triangulations of the same point set are connected by a flip if they are the only two 
triangulations refining an almost triangulation. All these concepts are illustrated in 
Figs. 9 and 13. Specifically, Fig. 13 shows a polyhedral subdivision which is also an almost 
triangulation. Moreover, the two possible refinements are the triangulations in Fig. 9. 
As mentioned, the triangulations in Fig. 9 differ by a flip, so that the formal definition 
agrees with the descriptions in the previous section. 

From the previous section, a triangulation induced by the fitness landscape is the 
shape of the landscape. More generally, we can describe all shapes using the concepts 
defined in this section. Consider again the parametric linear programming problem 
where $v$ varies: 

$\tilde{w}(v) = \max \{ p \cdot w : \rho(p) = v \}$ for all $v \in [0,1]^L$. 

The domains of linearity of $\tilde{w}$ constitute a polyhedral subdivision of the genotope 
(De Loera et al. 2010, Chapter 2). The shape of a fitness landscape is the polyhedral 
subdivision induced by the landscape. This subdivision is not always a triangulation. Recall 
from the two-loci case that positive epistasis corresponds to u > 0 and negative epistasis to 
u < 0, for 

$u = w_{00} - w_{01} - w_{10} + w_{11}.$ 

FIGURE 13. The polyhedral subdivision is an almost triangulation. The 
two possible refinements result in the two triangulations from Fig. 9. 

However, one does not obtain a triangulation if u = 0. The fitness landscape is generic if 
it induces a triangulation, and the corresponding shapes are the generic shapes. 

In order to further describe the relations between shapes, we will consider minimal 
dependence sets of points, as in the following example. 

Example 6. Consider the vertices of the genotope $[0,1]^2$. The relation 

$1 \cdot (0,0) - 1 \cdot (0,1) - 1 \cdot (1,0) + 1 \cdot (1,1) = 0$ 

is an affine dependence relation, since the sum of the coefficients 1 - 1 - 1 + 1 is zero. This 
set of four points is a minimal dependence set, in the sense that every proper subset of the 
four points is independent. The form 

$u = w_{00} - w_{01} - w_{10} + w_{11}$ 

corresponds to the dependence relation. Notice that the form is unique up to scaling. 

We define a circuit as a minimal affine dependence set. The corresponding forms are 
called circuits as well, and they are unique up to scaling. 

Flips and circuits are closely related. The circuit u corresponds to the flip between the 
two triangulations of the genotope in the two-loci case. More precisely, the triangulation 
corresponding to positive epistasis is described by u > 0, and the flip corresponds to 
replacing u > 0 by u < 0. In general, a flip corresponds to changing the sign of a circuit, 
and some examples for L = 3 are given in the next section. 

The next concept will be used for defining the secondary polytope, and for describing 
flips and circuits in more detail. For a triangulation, we define the GKZ vector as follows: 
the j-th coordinate of the GKZ vector is the sum of the volumes of all simplices 
containing the point $p_j$. 

Example 7. The GKZ vector for the triangulation in the two-loci case associated with positive 
epistasis is (2, 1, 1, 2). Indeed, the triangles have the same area. The vertices 00 and 11 belong 
to two different triangles each, whereas 10 and 01 are "sliced off", so that each of them belongs 
to one triangle only. Similarly, the GKZ vector for the triangulation associated with negative 
epistasis is (1, 2, 2, 1). 

The purpose of the next example is to relate circuits and GKZ vectors. 

Example 8. From the previous example, the GKZ vector for the triangulation associated with 
positive epistasis is (2, 1, 1, 2), whereas the GKZ vector for the triangulation associated with 
negative epistasis is (1, 2, 2, 1). We relate to the circuit 
$u = w_{00} - w_{01} - w_{10} + w_{11}$ the vector (1, -1, -1, 1). For the flip corresponding to u, 

(2, 1, 1, 2) - (1, 2, 2, 1) = (1, -1, -1, 1), 

so that the GKZ vectors differ by the vector corresponding to the circuit. 
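The computation in Examples 7 and 8 can be reproduced with a few lines of Python. This is a minimal sketch which assumes that volumes are normalized so that each triangle of the unit square has volume 1 (twice the Euclidean area), as the values in Example 7 suggest; the vertex order 00, 01, 10, 11 and the function names are choices made for this illustration.

```python
def normalized_area(p, q, r):
    """Twice the Euclidean area of the triangle pqr (normalized volume)."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]))


def gkz_vector(points, triangulation):
    """Sum, for every point, the normalized volumes of the triangles containing it."""
    vec = [0] * len(points)
    for tri in triangulation:
        vol = normalized_area(*(points[i] for i in tri))
        for i in tri:
            vec[i] += vol
    return tuple(vec)


points = [(0, 0), (0, 1), (1, 0), (1, 1)]   # genotypes 00, 01, 10, 11
positive = [(0, 1, 3), (0, 2, 3)]           # diagonal 00-11: 01 and 10 sliced off
negative = [(0, 1, 2), (1, 2, 3)]           # diagonal 01-10: 00 and 11 sliced off

gkz_pos = gkz_vector(points, positive)
gkz_neg = gkz_vector(points, negative)
difference = tuple(x - y for x, y in zip(gkz_pos, gkz_neg))
```

The difference of the two GKZ vectors is the coefficient vector of the circuit u.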

In the next section, we will consider the relations between flips, circuits and GKZ 
vectors for L = 3. 

For a given polytope, the secondary polytope is defined as the convex hull of the 
GKZ vectors of its triangulations. The geometric classification of fitness landscapes depends 
on the secondary polytope. For a genotope, the vertices of the secondary polytope correspond 
to generic shapes, and its edges to flips between the generic shapes. The 
higher-dimensional faces of the secondary polytope correspond to non-generic shapes. 
Consequently, the secondary polytope represents all the shapes and their relations. 

Example 9. The secondary polytope for the two-loci case is a line segment, where the vertices 
correspond to the two triangulations, and the line segment to the flat shape. 

Naturally, the secondary polytope is related to the flip graph. The 1-skeleton of a 
polytope is the graph consisting of the vertices and the edges of the polytope. It follows 
that the 1-skeleton of the secondary polytope is the subgraph of the flip graph induced 
by the regular triangulations. 

9. The 74 generic shapes of the cube 

The relation between shapes, circuits and GKZ vectors for the 3-cube is analogous 
to the two-loci case, as indicated in the previous section. Recall that the square has 2 
generic shapes, corresponding to u > 0 and u < 0, for 

$u = w_{00} - w_{01} - w_{10} + w_{11}.$ 

The cube has 74 generic shapes, where a shape is determined by the following 20 circuits. 

$a := w_{000} - w_{010} - w_{100} + w_{110}$ 
$b := w_{001} - w_{011} - w_{101} + w_{111}$ 
$c := w_{000} - w_{001} - w_{100} + w_{101}$ 
$d := w_{010} - w_{011} - w_{110} + w_{111}$ 
$e := w_{000} - w_{001} - w_{010} + w_{011}$ 
$f := w_{100} - w_{101} - w_{110} + w_{111}$ 
$g := w_{000} - w_{011} - w_{100} + w_{111}$ 
$h := w_{001} - w_{010} - w_{101} + w_{110}$ 
$i := w_{000} - w_{010} - w_{101} + w_{111}$ 
$j := w_{001} - w_{011} - w_{100} + w_{110}$ 
$k := w_{000} - w_{001} - w_{110} + w_{111}$ 
$l := w_{010} - w_{011} - w_{100} + w_{101}$ 
$m := w_{001} + w_{010} + w_{100} - w_{111} - 2w_{000}$ 
$n := w_{011} + w_{101} + w_{110} - w_{000} - 2w_{111}$ 
$o := w_{010} + w_{100} + w_{111} - w_{001} - 2w_{110}$ 
$p := w_{000} + w_{011} + w_{101} - w_{110} - 2w_{001}$ 
$q := w_{001} + w_{100} + w_{111} - w_{010} - 2w_{101}$ 
$r := w_{000} + w_{011} + w_{110} - w_{101} - 2w_{010}$ 
$s := w_{000} + w_{101} + w_{110} - w_{011} - 2w_{100}$ 
$t := w_{001} + w_{010} + w_{111} - w_{100} - 2w_{011}$ 


We will use the letters a, ..., t in the list, as well as u, throughout the section. 

In order to emphasize the connection to gene interactions, especially the algebraic 
aspects of epistasis, we will consider the interaction space. For any L, let $\mathcal{L}$ be 
the subspace of $\mathbb{R}^{\Sigma^L}$ consisting of additive fitness landscapes. The 
interaction space is the vector space dual to the quotient $\mathbb{R}^{\Sigma^L} / \mathcal{L}$. 

The interaction space is spanned by a set of circuits. The set is canonical up to scaling. 
In the two-loci case, the interaction space is spanned by u. For L = 3, the interaction 
space is spanned by the circuits a, ..., t. 
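The following Python sketch checks two properties of the 20 circuits for L = 3: each one is an affine dependence relation among the vertices of the cube, and together they span a space of dimension 8 - 4 = 4, since the additive landscapes form a 4-dimensional subspace. The coefficient vectors are a transcription of the list of circuits in the genotype order 000, 001, 010, 011, 100, 101, 110, 111; the helper names are hypothetical, and minimality of the dependence sets is not checked.

```python
from fractions import Fraction

GENOTYPES = ["000", "001", "010", "011", "100", "101", "110", "111"]

# Coefficients of each circuit on (w000, w001, w010, w011, w100, w101, w110, w111).
CIRCUITS = {
    "a": (1, 0, -1, 0, -1, 0, 1, 0), "b": (0, 1, 0, -1, 0, -1, 0, 1),
    "c": (1, -1, 0, 0, -1, 1, 0, 0), "d": (0, 0, 1, -1, 0, 0, -1, 1),
    "e": (1, -1, -1, 1, 0, 0, 0, 0), "f": (0, 0, 0, 0, 1, -1, -1, 1),
    "g": (1, 0, 0, -1, -1, 0, 0, 1), "h": (0, 1, -1, 0, 0, -1, 1, 0),
    "i": (1, 0, -1, 0, 0, -1, 0, 1), "j": (0, 1, 0, -1, -1, 0, 1, 0),
    "k": (1, -1, 0, 0, 0, 0, -1, 1), "l": (0, 0, 1, -1, -1, 1, 0, 0),
    "m": (-2, 1, 1, 0, 1, 0, 0, -1), "n": (-1, 0, 0, 1, 0, 1, 1, -2),
    "o": (0, -1, 1, 0, 1, 0, -2, 1), "p": (1, -2, 0, 1, 0, 1, -1, 0),
    "q": (0, 1, -1, 0, 1, -2, 0, 1), "r": (1, 0, -2, 1, 0, -1, 1, 0),
    "s": (1, 0, 0, -1, -2, 1, 1, 0), "t": (0, 1, 1, -2, -1, 0, 0, 1),
}


def is_affine_dependence(coeffs):
    """Coefficients sum to zero and the weighted sum of the cube vertices is zero."""
    if sum(coeffs) != 0:
        return False
    return all(
        sum(c * int(g[locus]) for c, g in zip(coeffs, GENOTYPES)) == 0
        for locus in range(3)
    )


def rank(rows):
    """Rank of an integer matrix by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


all_dependences = all(is_affine_dependence(c) for c in CIRCUITS.values())
dimension = rank(list(CIRCUITS.values()))
```

The rank of 4 makes the linear dependence among the 20 circuits explicit: any four independent circuits, for instance a, b, c and e, already span the interaction space.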

The circuit sign pattern of a fitness landscape consists of the sign (positive, negative or 
zero) of each circuit. In the two-loci case there is only one circuit and the sign pattern 
is either u > 0, u < 0 or u = 0. A central result for the geometric classification is that 
the circuit sign pattern determines the shape of the fitness landscape, but in general the 
converse does not hold (Beerenwinkel et al. 2007b). In particular, the signs of the 20 
circuits a, ..., t determine the shape of the fitness landscape for L = 3. In total, there are 
74 generic shapes. The fact that there are 20 circuits and only 74 generic shapes reflects 
dependence relations. 
Table 1 lists the shapes, where the vertices of the cube are ordered as follows 

000, 001, 010, 011, 100, 101, 110, 111. 

For each shape, the table gives the GKZ vector, the defining inequalities, and the 
adjacent shapes. In particular, for Shape 74 the notation 

acebdf 65,66,67,68,69,70 

in the inequalities column means that Shape 74 is defined by 

a, c, e, b, d, f > 0, 

and that the adjacent shapes are 65, 66, 67, 68, 69 and 70. 

Each inequality of Shape 74 can be described in terms of epistasis (in the usual sense), 
since each inequality keeps one locus fixed. In contrast, the inequalities of Shape 1 
consider three-way interactions. The fact that m > 0, where 

$m = w_{001} + w_{010} + w_{100} - w_{111} - 2w_{000},$ 

shows that the genotype 111 has lower fitness as compared to a linear expectation from 
the values 

$w_{001}, w_{010}, w_{100}, w_{000}.$ 

This observation shows already that the geometric theory is more fine-scaled as 
compared to conventional approaches. 
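A small numerical illustration of the circuit m may help. The sketch below uses made-up fitness values (they are not data from any study) and hypothetical function names; it shows that m is exactly the gap between the linear expectation for genotype 111 and its actual fitness.

```python
def linear_expectation_111(w):
    """Fitness of 111 expected if the three single-mutant effects were additive."""
    return w["001"] + w["010"] + w["100"] - 2 * w["000"]


def circuit_m(w):
    """m = w001 + w010 + w100 - w111 - 2*w000."""
    return w["001"] + w["010"] + w["100"] - w["111"] - 2 * w["000"]


# Hypothetical values; only the genotypes entering m are listed.
w = {"000": 10, "001": 12, "010": 13, "100": 11, "111": 14}

expected = linear_expectation_111(w)   # 12 + 13 + 11 - 20 = 16
m_value = circuit_m(w)                 # 16 - 14 = 2, so m > 0
```

Here m > 0 because the triple mutant (fitness 14) falls short of the additive expectation (16).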

The 74 shapes fall into six categories, called the interaction types. Specifically, the 
types consist of Shapes 1-2, 3-10, 11-34, 35-46, 47-70 and 71-74. For figures of the six 
interaction types, see De Loera et al. (2010, Chapter 1). Shapes of the same type differ 
only in the labeling of the vertices. In particular, the shapes of the same interaction type 
in the table have GKZ vectors that differ only by a permutation of the components. 

As in the two-loci case, the circuits correspond to flips. The letters representing 
circuits are ordered according to the shapes resulting from the corresponding flips. In 
particular, for Shape 74 the flip corresponding to a results in Shape 65, the flip 
corresponding to c results in Shape 66, and so forth. Similarly, for Shape 65 the notation 

$ceb\bar{a}$ 28,29,44,74 

means that the shape is defined by 

c > 0, e > 0, b > 0, a < 0, 

where $\bar{a}$ indicates that a < 0. 

TABLE 1. Shape numbers, GKZ vectors, inequalities and adjacent shapes 

 #   GKZ        inequalities and adjacent shapes 
 1   15515115   $tqom$ 3,4,5,6 
 2   51151551   $srpn$ 7,8,9,10 
 3   14436114   $\bar{t}bd\bar{e}$ 1,11,13,17 
 4   14614314   $\bar{q}bf\bar{c}$ 1,12,14,18 
 5   16414134   $\bar{o}df\bar{a}$ 1,15,16,19 
 6   34414116   $\bar{m}\bar{e}\bar{c}\bar{a}$ 1,28,29,31 
 7   41163441   $\bar{s}ac\bar{f}$ 2,20,22,26 
 8   41341641   $\bar{r}ae\bar{d}$ 2,21,23,27 
 9   43141461   $\bar{p}ce\bar{b}$ 2,24,25,30 
10   61141443   $\bar{n}\bar{f}\bar{d}\bar{b}$ 2,32,33,34 
11   13446213   $\bar{b}\bar{l}d\bar{e}$ 3,12,47,51 
12   13624413   $\bar{b}lf\bar{c}$ 4,11,48,53 
13   14346123   $\bar{d}\bar{j}b\bar{e}$ 3,15,47,54 
14   14613423   $\bar{f}\bar{h}b\bar{c}$ 4,16,48,55 
15   16324143   $\bar{d}jf\bar{a}$ 5,13,49,57 
16   16413243   $\bar{f}hd\bar{a}$ 5,14,49,58 
17   23346114   $e\bar{g}bd$ 3,28,51,54 
18   23613414   $c\bar{i}bf$ 4,29,53,55 
19   26313144   $a\bar{k}df$ 5,31,57,58 
20   31264431   $\bar{a}\bar{l}c\bar{f}$ 7,21,50,59 
21   31442631   $\bar{a}le\bar{d}$ 8,20,52,60 
22   32164341   $\bar{c}\bar{j}a\bar{f}$ 7,24,50,61 
23   32431641   $\bar{e}\bar{h}a\bar{d}$ 8,25,52,62 
24   34142361   $\bar{c}je\bar{b}$ 9,22,56,63 
25   34231461   $\bar{e}hc\bar{b}$ 9,23,56,64 
26   41164332   $f\bar{g}ac$ 7,32,59,61 
27   41431632   $d\bar{i}ae$ 8,33,60,62 
28   43324116   $eg\bar{c}\bar{a}$ 6,17,65,66 
29   43413216   $ci\bar{e}\bar{a}$ 6,18,65,67 
30   44131362   $b\bar{k}ce$ 9,34,63,64 
31   44313126   $ak\bar{e}\bar{c}$ 6,19,66,67 
32   61142334   $fg\bar{d}\bar{b}$ 10,26,68,69 
33   61231434   $di\bar{f}\bar{b}$ 10,27,68,70 
34   62131344   $bk\bar{f}\bar{d}$ 10,30,69,70 
35   13355331   $\bar{l}\bar{j}\bar{f}\bar{e}$ 36,37,47,50 
36   13533531   $l\bar{h}\bar{d}\bar{c}$ 35,37,48,52 
37   15333351   $jh\bar{b}\bar{a}$ 35,36,49,56 
38   31355313   $\bar{l}\bar{g}cd$ 39,44,51,59 
39   31533513   $l\bar{i}ef$ 38,44,53,60 
40   33155133   $\bar{j}\bar{g}ab$ 42,45,54,61 
41   33511533   $\bar{h}\bar{i}ab$ 43,46,55,62 
42   35133153   $j\bar{k}ef$ 40,45,57,63 
43   35311353   $h\bar{k}cd$ 41,46,58,64 
44   51333315   $gi\bar{b}\bar{a}$ 38,39,65,68 
45   53133135   $gk\bar{d}\bar{c}$ 40,42,66,69 
46   53311335   $ik\bar{f}\bar{e}$ 41,43,67,70 
47   13356222   $\bar{d}\bar{b}f\bar{e}$ 11,13,35,71 
48   13623522   $\bar{f}\bar{b}d\bar{c}$ 12,14,36,72 
49   16323252   $\bar{f}\bar{d}b\bar{a}$ 15,16,37,73 
50   22265331   $\bar{c}\bar{a}e\bar{f}$ 20,22,35,71 
51   22356213   $e\bar{b}\bar{c}d$ 11,17,38,71 
52   22532631   $\bar{e}\bar{a}c\bar{d}$ 21,23,36,72 
53   22623513   $c\bar{b}\bar{e}f$ 12,18,39,72 
54   23256123   $e\bar{d}\bar{a}b$ 13,17,40,71 
55   23612523   $c\bar{f}\bar{a}b$ 14,18,41,72 
56   25232361   $\bar{e}\bar{c}a\bar{b}$ 24,25,37,73 
57   26223153   $a\bar{d}\bar{e}f$ 15,19,42,73 
58   26312253   $a\bar{f}\bar{c}d$ 16,19,43,73 
59   31265322   $f\bar{a}\bar{d}c$ 20,26,38,71 
60   31532622   $d\bar{a}\bar{f}e$ 21,27,39,72 
61   32165232   $f\bar{c}\bar{b}a$ 22,26,40,71 
62   32521632   $d\bar{e}\bar{b}a$ 23,27,41,72 
63   35132262   $b\bar{c}\bar{f}e$ 24,30,42,73 
64   35221362   $b\bar{e}\bar{d}c$ 25,30,43,73 
65   52323216   $ceb\bar{a}$ 28,29,44,74 
66   53223126   $aed\bar{c}$ 28,31,45,74 
67   53312226   $acf\bar{e}$ 29,31,46,74 
68   61232325   $dfa\bar{b}$ 32,33,44,74 
69   62132235   $bfc\bar{d}$ 32,34,45,74 
70   62221335   $bde\bar{f}$ 33,34,46,74 
71   22266222   $ef\bar{d}\bar{b}\bar{c}\bar{a}$ 47,50,51,54,59,61 
72   22622622   $cd\bar{f}\bar{b}\bar{e}\bar{a}$ 48,52,53,55,60,62 
73   26222262   $ab\bar{f}\bar{d}\bar{e}\bar{c}$ 49,56,57,58,63,64 
74   62222226   $acebdf$ 65,66,67,68,69,70 

For an explicit description, consider Shape 74 and the flip corresponding to a. From the 
table, Shape 74 is defined by 

a > 0, c > 0, e > 0, b > 0, d > 0, f > 0. 

The result of the flip is the shape defined by 

a < 0, c > 0, e > 0, b > 0, d > 0, f > 0, 

which reduces to 

a < 0, c > 0, e > 0, b > 0, 

since the four inequalities imply that d > 0 and f > 0 (indeed, d = b + c - a and 
f = b + e - a). Shape 65 is described by these four inequalities in the table. 

The table lists GKZ vectors as well. Flips and GKZ vectors are related, as in the 
two-loci case. For instance, the GKZ vector is 62222226 for Shape 74, and 52323216 for Shape 
65. The circuit a corresponds to the vector (1, 0, -1, 0, -1, 0, 1, 0), and 

(6, 2, 2, 2, 2, 2, 2, 6) - (5, 2, 3, 2, 3, 2, 1, 6) = (1, 0, -1, 0, -1, 0, 1, 0). 
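The same check can be repeated for other flips out of Shape 74. The Python sketch below takes the GKZ vectors of Shapes 65, 66, 68, 69 and 70 as listed in Table 1 and the face circuits a, ..., f as coefficient vectors in the genotype order 000, ..., 111 (a transcription made for this illustration). It also verifies the two linear relations among circuits that make d > 0 and f > 0 redundant for Shape 65.

```python
# Coefficient vectors of the six face circuits on (w000, ..., w111).
a = (1, 0, -1, 0, -1, 0, 1, 0)
b = (0, 1, 0, -1, 0, -1, 0, 1)
c = (1, -1, 0, 0, -1, 1, 0, 0)
d = (0, 0, 1, -1, 0, 0, -1, 1)
e = (1, -1, -1, 1, 0, 0, 0, 0)
f = (0, 0, 0, 0, 1, -1, -1, 1)

# GKZ vectors as listed in Table 1.
GKZ = {
    74: (6, 2, 2, 2, 2, 2, 2, 6),
    65: (5, 2, 3, 2, 3, 2, 1, 6),
    66: (5, 3, 2, 2, 3, 1, 2, 6),
    68: (6, 1, 2, 3, 2, 3, 2, 5),
    69: (6, 2, 1, 3, 2, 2, 3, 5),
    70: (6, 2, 2, 2, 1, 3, 3, 5),
}

# Circuit whose sign change leads from Shape 74 to the given shape.
FLIP_FROM_74 = {65: a, 66: c, 68: b, 69: d, 70: f}


def sub(x, y):
    return tuple(s - t for s, t in zip(x, y))


def add(x, y):
    return tuple(s + t for s, t in zip(x, y))


flips_match = all(sub(GKZ[74], GKZ[k]) == v for k, v in FLIP_FROM_74.items())

# a + d = b + c and a + f = b + e, hence d = b + c - a and f = b + e - a.
relations_hold = add(a, d) == add(b, c) and add(a, f) == add(b, e)
```

Both checks succeed, so each listed flip changes the GKZ vector by exactly the coefficient vector of the corresponding circuit.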

For a systematic interpretation of the 20 circuits a, ..., t listed, one may consider the 
Fourier transform for the group $(\mathbb{Z}_2)^n$ (Beerenwinkel et al. 2007b). Geometric 
interpretations of the circuits are given in the same paper. 

10. Shapes and empirical data 

The described relations between circuits, flips, GKZ vectors and secondary polytopes 
hold under very general assumptions. We restricted our discussion to biallelic 
L-loci populations in order to keep the presentation simple. For the geometric theory of 
gene interactions, the genotope is defined for any set of genotypes found in a population 
and the shape is defined accordingly (Beerenwinkel et al. 2007b). In fact, the authors 
stress that the genotope is never an L-cube for binary data and many loci (> 20), which 
is important for complexity reasons. Empirical examples of general type (Beerenwinkel 
et al. 2007b,c) can be analyzed similarly to the restricted case we considered here. For a 
shape analysis of empirical data, one needs several fitness measurements of each 
genotype due to statistical variation. One may not find a unique shape, but rather a set of 
similar shapes which are compatible with the data. 

A shape analysis of HIV fitness data is given in Beerenwinkel et al. (2007b). The 
biallelic three-loci system considered there is associated with HIV drug resistance. From 
bootstrapping, the three dominant shapes are 2, 7 and 10. Notice that these shapes are 
adjacent, and have similar GKZ vectors. Moreover, the five most dominant shapes 2, 7, 
10, 26 and 32 appear as a face of the secondary polytope of the cube, and have similar 
GKZ vectors as well. 

Software for analyzing shapes is available, for example Polymake, 

http://www.polymake.org/doku.php 

11. Shapes and fitness graphs 

A fitness graph is determined by fitness ranks only. The information from shapes is 
incomparably more fine-scaled. For a thorough comparison between fitness graphs and 
the geometric theory, see Crona et al. (2013). 



We will briefly discuss the two-loci case. Assume that the 11 genotype has maximal 
fitness. Then positive epistasis is compatible with three fitness graphs (no arrows down, 
exactly one arrow down, or two arrows down). On the other hand, consider the fitness 
graph with all arrows up. Such a graph is compatible with positive, negative or no 
epistasis. This example shows that fitness graphs provide information that cannot be 
obtained from the geometric classification, and vice versa, and the same observation 
holds for any L. Moreover, there is usually an overlap in the information from fitness 
graphs and shapes. If all arrows point away from a particular genotype, and if the 
genotype is "sliced off" for the shape, then one has two indications that the genotype 
has relatively low fitness. 
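The two-loci comparison can be made concrete with a short Python sketch. The fitness values are hypothetical and the function names are chosen for this illustration; the fitness graph is represented as the set of arrows between genotypes that differ at one locus, each pointing toward the genotype with higher fitness.

```python
def fitness_graph(w):
    """Arrows (g, h) between one-locus neighbours g, h with w[h] > w[g]."""
    arrows = set()
    for g in w:
        for i in range(len(g)):
            h = g[:i] + ("1" if g[i] == "0" else "0") + g[i + 1:]
            if w[h] > w[g]:
                arrows.add((g, h))
    return arrows


def epistasis(w):
    """u = w00 - w01 - w10 + w11."""
    return w["00"] - w["01"] - w["10"] + w["11"]


# Three hypothetical landscapes with the same fitness ranks along every edge.
w_positive = {"00": 2, "01": 4, "10": 4, "11": 8}   # u = 2
w_additive = {"00": 2, "01": 4, "10": 4, "11": 6}   # u = 0
w_negative = {"00": 2, "01": 4, "10": 4, "11": 5}   # u = -1

graphs = [fitness_graph(w) for w in (w_positive, w_additive, w_negative)]
signs = [epistasis(w) for w in (w_positive, w_additive, w_negative)]
```

All three landscapes give the same fitness graph, with every arrow pointing up toward 11, while the sign of u differs, so the graph alone does not determine the shape.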

From a more philosophical perspective, the interest in fitness graphs and shapes de- 
pends on the belief that average effects of mutations are insufficient for analyzing evo- 
lutionary dynamics. 



12. Discussion 

We have considered fitness graphs and the geometric theory of gene interactions. 
Fitness graphs and shapes provide complementary information, and there tends to be some 
overlap in the observations. Fitness graphs are useful for analyzing peaks, sign 
epistasis, mutational trajectories, and other coarse properties of fitness landscapes. The 
graphs have been used in empirical work, and for relating global and local properties 
of fitness landscapes. 

The geometric theory extends the usual concept of epistasis to any number of loci, where 
shapes, as defined in the geometric theory, correspond to positive and negative epistasis 
for two mutations. The geometric classification is meaningful because it comes with a 
structure. A particular shape can be put in a context, and compared to other shapes. 
In summary, for biallelic populations where all $2^L$ genotypes are represented, the 
genotope is an L-cube. The shape of a fitness landscape is a polyhedral subdivision of the 
genotope induced by the landscape. The generic shapes are the triangulations of the 
genotope. The relation between all the generic shapes can be described in terms of 
flips, or minimal changes between shapes. The flip graph provides an overview of the 
generic shapes and how they can be transformed into each other by flips. The secondary 
polytope encodes all shapes and their relations, where the generic shapes correspond to 
vertices, and the non-generic shapes to the higher-dimensional faces. For an algebraic 
perspective, the interaction space is spanned by a canonical set of linear forms, or 
circuits. The shapes are determined by the sign pattern of the circuits, and changing the sign 
of a circuit corresponds to a flip. 

The geometric theory has provided new insights about gene interactions in empirical 
studies. The theory may be considered a fundamentally new approach to recombina- 
tion. There is clearly a potential for new applications of shapes to evolutionary biology 
as well as various empirical problems, even if the theory is complete. 

The approaches discussed here are similar in one respect. They make no assumptions, 
or minimal assumptions, about the underlying fitness landscapes. The accuracy of an 
analysis of empirical data using fitness graphs or the geometric theory does not depend 
on any a priori assumptions about the fitness landscape. Fitness graphs and shapes are 
well suited for empirical studies for that reason. 